About the data

We will be looking at observations of freshwater fish in Norwegian lakes. Our observations are obtained from GBIF, by first downloading all ray-finned fish (Actinopterygii) observations registered in Norway, and then filtering these based on the following steps:

  1. Remove species that are not on the list of Norwegian freshwater fish, as described by SNL.
  2. Match each observation to its closest lake, and remove observations farther than 10 meters from that lake.
  3. Remove all observations with no time variable.
  4. Select the 12 most prevalent species.
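
The four steps above can be sketched in plain Python. This is only an illustration, not the code used to produce this report: the record layout, the field names (`species`, `lake_distance_m`, `year`) and the three-species stand-in list are assumptions made for the example, and the distance to the closest lake is taken as already computed.

```python
from collections import Counter

# Hypothetical records; field names are assumptions for this sketch.
observations = [
    {"species": "Salmo trutta", "lake_distance_m": 2.0, "year": 1993},
    {"species": "Salmo trutta", "lake_distance_m": 35.0, "year": 1993},      # too far from a lake
    {"species": "Perca fluviatilis", "lake_distance_m": 0.0, "year": None},  # no time variable
    {"species": "Perca fluviatilis", "lake_distance_m": 4.5, "year": 2001},
    {"species": "Gadus morhua", "lake_distance_m": 1.0, "year": 1999},       # not on the list
]

# Small stand-in for the freshwater species list.
freshwater_species = {"Salmo trutta", "Perca fluviatilis", "Esox lucius"}


def filter_observations(records, allowed_species, max_distance_m=10.0, n_species=12):
    """Apply the four filtering steps in order and return the remaining records."""
    # Step 1: keep only species on the freshwater list.
    kept = [r for r in records if r["species"] in allowed_species]
    # Step 2: drop observations farther than max_distance_m from the closest lake.
    kept = [r for r in kept if r["lake_distance_m"] <= max_distance_m]
    # Step 3: drop observations without a time variable.
    kept = [r for r in kept if r["year"] is not None]
    # Step 4: keep only the n_species most frequently observed species.
    counts = Counter(r["species"] for r in kept)
    top_species = {species for species, _ in counts.most_common(n_species)}
    return [r for r in kept if r["species"] in top_species]


filtered = filter_observations(observations, freshwater_species)
print(len(filtered), "observations remain")
```

Note that the order matters: because the most prevalent species are selected last, prevalence is measured on observations that have already passed the lake and time filters.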

The list of fish we downloaded can be seen below. We excluded fish that spawn in salt water, so some species that may be observed in lakes, such as eel (Anguilla anguilla) and European flounder (Platichthys flesus), are not part of our analysis.

## Petromyzon marinus, Lampetra fluviatilis, Lampetra planeri, Lampetra japonica, Rutilus rutilus, Leuciscus leuciscus, Leuciscus cephalus, Leuciscus idus, Phoxinus phoxinus, Scardinius erythrophthalmus, Aspius aspius, Tinca tinca, Alburnus alburnus, Blicca bjoerkna, Abramis brama, Carassius carassius, Carassius auratus, Cyprinus carpio, Gobio gobio, Leucaspius delineatus, Ictalurus nebulosus, Esox lucius, Osmerus eperlanus, Salmo salar, Salmo trutta, Oncorhynchus mykiss, Oncorhynchus gorbuscha, Oncorhynchus keta, Salvelinus alpinus, Salvelinus fontinalis, Salvelinus namaycush, Coregonus lavaretus, Coregonus albula, Thymallus thymallus, Lota lota, Gasterosteus aculeatus, Pungitius pungitius, Cottus gobio, Cottus poecilopus, Myoxocephalus quadricornis, Perca fluviatilis, Sander lucioperca, Gymnocephalus cernuus, Lepomis gibbosus

The resulting data set consists of 92399 fish observations in 29749 lakes. Next, we will take a look at these fish distributed across species and time.

Preliminary insights

To begin with, for the sake of us non-biologists, here are the Latin, English and Norwegian names of the twelve included species.

Let us take a look at the number of observations of each species.

Here we see that Salmo trutta (trout) outranks the other species by quite a bit, although Perca fluviatilis (European perch) and Salvelinus alpinus (Arctic char) are also quite prevalent.
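
The counts behind a ranking like this are a simple tally of observations per species. A minimal sketch, assuming one species label per observation; the sample labels below are made up and do not reflect the real counts:

```python
from collections import Counter

# Hypothetical sample: one species label per observation.
species_per_observation = [
    "Salmo trutta", "Salmo trutta", "Salmo trutta",
    "Perca fluviatilis", "Perca fluviatilis",
    "Salvelinus alpinus",
]


def rank_species(labels):
    """Return (species, count) pairs sorted from most to least observed."""
    return Counter(labels).most_common()


ranking = rank_species(species_per_observation)
for species, count in ranking:
    print(f"{species}: {count}")
```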

It is also interesting to take a look at when the observations were made. An important characteristic of our data is that the observations are made at very irregular times: in some years there seem to be large studies of many lakes, while in other years there are only a few single observations. The earliest observation was made in 1877, but the observations from the first decades are few and mostly come from single lakes. We look at the observation counts for the years after 1970, which is when the number of observations really starts increasing.

So 1993 was the big lakefish year, especially for Salmo trutta and Salvelinus alpinus. It is also interesting to note that there are relatively many observations of Coregonus albula and Coregonus lavaretus during the period 2002 to 2005, and barely any at all in the other years. Perca fluviatilis is also interesting in that its observation counts are fairly stable over time: there are quite a few observations every year, but never any huge spikes such as those S. trutta and S. alpinus have in 1993.
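
Yearly counts per species, restricted to the years after 1970, can be tallied as in the sketch below. The `(species, year)` tuples are hypothetical and only illustrate the shape of the computation:

```python
from collections import Counter

# Hypothetical (species, year) pairs, one per observation.
observations = [
    ("Salmo trutta", 1993), ("Salmo trutta", 1993), ("Salvelinus alpinus", 1993),
    ("Coregonus albula", 2003), ("Perca fluviatilis", 1965), ("Perca fluviatilis", 1994),
]


def yearly_counts(records, after_year=1970):
    """Count observations per (year, species) pair for years later than after_year."""
    return Counter(
        (year, species) for species, year in records if year > after_year
    )


counts = yearly_counts(observations)
for (year, species), n in sorted(counts.items()):
    print(year, species, n)
```

Keying the counter on the `(year, species)` pair keeps years with no observations absent rather than zero, which is worth remembering when plotting irregular data like these.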

Lastly, we also take a look at where the observations are made.

Plotting data on the map

First, let us look at all the observations on a map of Norway.

So, we can definitely see something here, but it is a bit of a mess. We have added some transparency to the points, but they are still very much on top of each other, so this does not necessarily convey what we want it to. Let us facet it into one map per species to tidy it up a bit.

Here, it is interesting to note that different species seem to have very distinct locations. This could simply be because some studies target specific species in certain areas, but it may also reflect, to some degree, where the species live. Specifically, many species are found only (or mainly) in eastern Norway (Finnmark and Hedmark), while others, such as S. trutta, are widely spread. There are some exceptions to this: S. fontinalis is found only in the very south of Norway, C. lavaretus mostly in the central part of southern Norway, and G. aculeatus mostly in western Norway. Also, S. alpinus is found mostly in the north of Norway. It would be interesting to examine whether this coincides with common knowledge.

In order to introduce another dimension here, let us look at the data animated over time.

Again, let us facet this by species in order to see a bit more detail.

This is a little easier to look at.

It is slightly worrying to note here that there seem to be observations in Sweden. This is especially worrying since this was not the case in the non-faceted animation. This should be examined further.

When we looked at the bar plot of observations per year at the beginning of this report, we noted that Perca fluviatilis was one of the few species with fairly stable observation numbers across years. It may be interesting to take a closer look at these observations.

When looking at the animations, both of Perca fluviatilis and of the other species, there sometimes seem to be “walking” observations. We can focus on the period 1930 to 1945 for P. fluviatilis, where there is a distinct walk. When we filter the data set for P. fluviatilis and the period 1925 to 1950, we see that there are in fact only two observations, and neither of them falls between 1930 and 1940. This is worrying. Together with the earlier suspicion that the faceted and non-faceted animations do not display the same information, it leads us to conclude that we are not implementing the animation correctly. We will look further into this at a later time.

## Simple feature collection with 2 features and 3 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: 253503.2 ymin: 6697295 xmax: 369101.1 ymax: 6785476
## epsg (SRID):    25833
## proj4string:    +proj=utm +zone=33 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs
##   year waterBody           species                 geometry
## 1 1945    Røgden Perca fluviatilis POINT (369101.1 6697295)
## 2 1926     Mjösa Perca fluviatilis POINT (253503.2 6785476)
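
A check of this kind amounts to filtering on species and a year window. The sketch below reuses the two records from the output above and adds two hypothetical records (marked in the comments) so that the filter has something to exclude; the function name and record layout are assumptions made for the example.

```python
records = [
    {"year": 1945, "waterBody": "Røgden", "species": "Perca fluviatilis"},  # from the output above
    {"year": 1926, "waterBody": "Mjösa", "species": "Perca fluviatilis"},   # from the output above
    {"year": 1993, "waterBody": "Example lake", "species": "Perca fluviatilis"},  # hypothetical
    {"year": 1935, "waterBody": "Example lake", "species": "Salmo trutta"},       # hypothetical
]


def select(rows, species, first_year, last_year):
    """Return rows for one species with first_year <= year <= last_year."""
    return [
        r for r in rows
        if r["species"] == species and first_year <= r["year"] <= last_year
    ]


window = select(records, "Perca fluviatilis", 1925, 1950)
during_walk = select(records, "Perca fluviatilis", 1930, 1940)
print(len(window), "observations in 1925-1950,", len(during_walk), "in 1930-1940")
```

If the second selection is empty while the animation shows points moving in those years, the movement cannot come from the data itself.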

In the animations above we got a sense of how the observations are distributed over time (assuming the animations are not completely wrong), but the constant movement makes them a bit difficult to absorb. Here we have divided the observations from 1971 to 2010 into four periods: 1971 - 1980, 1981 - 1990, 1991 - 2000 and 2001 - 2010, and we plot all species for each period.

From these plots we see that a large portion of the observations were made between 1991 and 2000, which is expected given the large spike in 1993 in the bar plot of observations per year. There also seems to be a large area in the middle of southern Norway with close to no observations in this period. This area does, however, seem to hold the majority of the observations in the next period, 2001 - 2010. This could indicate that some type of organized survey was carried out in that period, perhaps to fill this geographic hole in the data.
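
Assigning each observation to one of the four ten-year periods is an integer-division exercise. A minimal sketch, where the function name and default arguments are assumptions chosen to match the periods listed above:

```python
def period_label(year, first_year=1971, last_year=2010, width=10):
    """Return a label such as '1991 - 2000', or None if year is outside the range."""
    if year < first_year or year > last_year:
        return None
    # Index of the period the year falls in, counted from first_year.
    index = (year - first_year) // width
    start = first_year + index * width
    return f"{start} - {start + width - 1}"


for year in (1970, 1971, 1980, 1981, 1993, 2010):
    print(year, "->", period_label(year))
```

Starting the periods at 1971 rather than 1970 means that an observation from 1970 falls outside all four periods and is left out of these plots.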

By counting the number of unique species per lake we can find the species richness per lake.

Based on the bar plot of locations, it is no surprise that Finnmark and Hedmark have species-rich lakes.

Here are the ten most species-rich lakes:

##        waterBody species_richness
## 1          Mjösa               10
## 2  Buolbmatjavri                9
## 3    Hurdalsjøen                9
## 4        Ossjøen                9
## 5     Vestvatnet                9
## 6       Rømsjøen                8
## 7    Vingersjøen                8
## 8       Øvrevatn                8
## 9      Storsjøen                8
## 10     Nedrevatn                8
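
Species richness as used here is the number of distinct species observed in a lake, so repeated observations of the same species count once. A minimal sketch with made-up records; the `lake_id` key is an assumption for the example, and it matters in practice because grouping on lake name alone would merge different lakes that share a name.

```python
from collections import defaultdict

# Hypothetical (lake_id, species) pairs, one per observation.
records = [
    ("lake_a", "Salmo trutta"), ("lake_a", "Salmo trutta"),
    ("lake_a", "Perca fluviatilis"), ("lake_a", "Esox lucius"),
    ("lake_b", "Salmo trutta"),
    ("lake_c", "Salvelinus alpinus"), ("lake_c", "Salmo trutta"),
]


def species_richness(rows):
    """Return (lake, richness) pairs, richest lake first, ties broken by lake id."""
    species_by_lake = defaultdict(set)
    for lake, species in rows:
        species_by_lake[lake].add(species)  # the set ignores repeated species
    richness = {lake: len(species) for lake, species in species_by_lake.items()}
    return sorted(richness.items(), key=lambda item: (-item[1], item[0]))


for lake, richness in species_richness(records)[:10]:
    print(lake, richness)
```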

Reference for data:

## GBIF Occurrence Download https://doi.org/10.15468/dl.ugszgk accessed via GBIF.org on 2019-11-07